WO 2004/056983 



64 



PCT/GB2003/005664 



Sequence Information: 

SEQ ID NO: 1 (INSP005A nucleotide sequence exon 1) 

1 ATGGGTGGTA GTGGTGTCGT GGAGGTCCCC TTCCTGCTCT CCAGCAAGTA
51 CG

SEQ ID NO: 2 (INSP005A protein sequence exon 1) 

1 MGGSGVVEVP FLLSSKYD

SEQ ID NO: 3 (INSP005A nucleotide sequence exon 2) 

1 ATGAGCCCAG CCGCCAGGTC ATCCTGGAGG CTCTTGCGGA GTTTGAACGT
51 TCCACGTGCA TCAGGTTTGT CACCTATCAG GACCAGAGAG ACTTCATTTC
101 CATCATCCCC ATGTATGG

SEQ ID NO: 4 (INSP005A protein sequence exon 2)

1 EPSRQVILEA LAEFERSTCI RFVTYQDQRD FISIIPMYG

SEQ ID NO: 5 (INSP005A nucleotide sequence exon 3) 

1 GTGCTTCTCG AGTGTGGGGC GCAGTGGAGG GATGCAGGTG GTCTCCCTGG
51 CGCCCACGTG TCTCCAGAAG GGCCGGGGCA TTGTCCTTCA TGAGCTCATG
101 CATGTGCTGG GCTTCTGGCA CGAGCACACG CGGGCCGACC GGGACCGCTA
151 TATCCGTGTC AACTGGAACG AGATCCTGCC AG

SEQ ID NO: 6 (INSP005A protein sequence exon 3)

1 CFSSVGRSGG MQVVSLAPTC LQKGRGIVLH ELMHVLGFWH EHTRADRDRY
51 IRVNWNEILP G

SEQ ID NO: 7 (INSP005A nucleotide sequence exon 4)

1 GCTTTGAAAT CAACTTCATC AAGTCTCAGA GCAGCAACAT GCTGACGCCC
51 TATGACTACT CCTCTGTGAT GCACTATGGG AG

SEQ ID NO: 8 (INSP005A protein sequence exon 4)

1 FEINFIKSQS SNMLTPYDYS SVMHYGR 

SEQ ID NO: 9 (INSP005A nucleotide sequence exon 5) 

1 GCTCGCCTTC AGCCGGCGTG GGCTGCCCAC CATCACACCA CTTTGGGCCC
51 CCAGTGTCCA CATCGGCCAG CGATGGAACC TGAGTGCCTC GGACATCACC
101 CGGGTCCTCA AACTCTACGG CTGCAGCCCA AGTGGCCCCA GGCCCCGTGG
151 GAGAG

SEQ ID NO: 10 (INSP005A protein sequence exon 5)

1 LAFSRRGLPT ITPLWAPSVH IGQRWNLSAS DITRVLKLYG CSPSGPRPRG 
51 RG 

SEQ ID NO: 11 (INSP005A nucleotide sequence exon 6) 

1 GGTCCCATGC CCACAGCACT GGTAGGAGCC CCGCCCCGGC CTCCCTATCT
51 CTGCAGCGGC TTTTGGAGGC ACTGTCGGCG GAATCCAGGA GCCCCGACCC
101 CAGTGGTTCC AGTGCGGGAG GCCAGCCCGT TCCTGCAGGG CCTGGGGAGA
151 GCCCACATGG GTGGGAGTCC CCTGCCCTGA AAAAGCTCAG TGCAGAGGCC
201 TCGGCAAGGC AGCCTCAGAC CCTAGCTTCC TCCCCAAGAT CAAGGCCTGG
251 AGCAGGTGCC CCCGGTGTTG CTCAGGAGCA GTCCTGGCTG GCCGGAGTGT
301 CCACCAAGCC CACAGTCCCA TCTTCAGAAG CAGGAATCCA GCCAGTCCCT
351 GTCCAGGGAA GCCCAGCTCT GCCAGGGGGC TGTGTACCTA GAAATCATTT
401 CAAGGGGATG TCCGAAGAT

SEQ ID NO: 12 (INSP005A protein sequence exon 6)

1 SHAHSTGRSP APASLSLQRL LEALSAESRS PDPSGSSAGG QPVPAGPGES
51 PHGWESPALK KLSAEASARQ PQTLASSPRS RPGAGAPGVA QEQSWLAGVS
101 TKPTVPSSEA GIQPVPVQGS PALPGGCVPR NHFKGMSED






SEQ ID NO: 13 (INSP005A full nucleotide sequence)

1 ATGGGTGGTA GTGGTGTCGT GGAGGTCCCC TTCCTGCTCT CCAGCAAGTA
51 CGATGAGCCC AGCCGCCAGG TCATCCTGGA GGCTCTTGCG GAGTTTGAAC
101 GTTCCACGTG CATCAGGTTT GTCACCTATC AGGACCAGAG AGACTTCATT
151 TCCATCATCC CCATGTATGG GTGCTTCTCG AGTGTGGGGC GCAGTGGAGG
201 GATGCAGGTG GTCTCCCTGG CGCCCACGTG TCTCCAGAAG GGCCGGGGCA
251 TTGTCCTTCA TGAGCTCATG CATGTGCTGG GCTTCTGGCA CGAGCACACG
301 CGGGCCGACC GGGACCGCTA TATCCGTGTC AACTGGAACG AGATCCTGCC
351 AGGCTTTGAA ATCAACTTCA TCAAGTCTCA GAGCAGCAAC ATGCTGACGC
401 CCTATGACTA CTCCTCTGTG ATGCACTATG GGAGGCTCGC CTTCAGCCGG
451 CGTGGGCTGC CCACCATCAC ACCACTTTGG GCCCCCAGTG TCCACATCGG
501 CCAGCGATGG AACCTGAGTG CCTCGGACAT CACCCGGGTC CTCAAACTCT
551 ACGGCTGCAG CCCAAGTGGC CCCAGGCCCC GTGGGAGAGG GTCCCATGCC
601 CACAGCACTG GTAGGAGCCC CGCCCCGGCC TCCCTATCTC TGCAGCGGCT
651 TTTGGAGGCA CTGTCGGCGG AATCCAGGAG CCCCGACCCC AGTGGTTCCA
701 GTGCGGGAGG CCAGCCCGTT CCTGCAGGGC CTGGGGAGAG CCCACATGGG
751 TGGGAGTCCC CTGCCCTGAA AAAGCTCAGT GCAGAGGCCT CGGCAAGGCA
801 GCCTCAGACC CTAGCTTCCT CCCCAAGATC AAGGCCTGGA GCAGGTGCCC
851 CCGGTGTTGC TCAGGAGCAG TCCTGGCTGG CCGGAGTGTC CACCAAGCCC
901 ACAGTCCCAT CTTCAGAAGC AGGAATCCAG CCAGTCCCTG TCCAGGGAAG
951 CCCAGCTCTG CCAGGGGGCT GTGTACCTAG AAATCATTTC AAGGGGATGT
1001 CCGAAGAT



SEQ ID NO: 14 (INSP005A full protein sequence) 

1 MGGSGVVEVP FLLSSKYDEP SRQVILEALA EFERSTCIRF VTYQDQRDFI
51 SIIPMYGCFS SVGRSGGMQV VSLAPTCLQK GRGIVLHELM HVLGFWHEHT
101 RADRDRYIRV NWNEILPGFE INFIKSQSSN MLTPYDYSSV MHYGRLAFSR
151 RGLPTITPLW APSVHIGQRW NLSASDITRV LKLYGCSPSG PRPRGRGSHA
201 HSTGRSPAPA SLSLQRLLEA LSAESRSPDP SGSSAGGQPV PAGPGESPHG
251 WESPALKKLS AEASARQPQT LASSPRSRPG AGAPGVAQEQ SWLAGVSTKP
301 TVPSSEAGIQ PVPVQGSPAL PGGCVPRNHF KGMSED

SEQ ID NO: 15 (INSP005B nucleotide sequence exon 1)

1 ATGGAGGGTG TAGGGGGTCT CTGGCCTTGG GTGCTGGGTC TGCTCTCCTT
51 GCCAG

SEQ ID NO: 16 (INSP005B protein sequence exon 1) 

1 MEGVGGLWPW VLGLLSLPG 


SEQ ID NO: 17 (INSP005B nucleotide sequence exon 2) 

1 GTGTGATCCT AGGAGCGCCC CTGGCCTCCA GCTGCGCAGG AGCCTGTGGT
51 ACCAGCTTCC CAGATGGCCT CACCCCTGAG GGAACCCAGG CCTCCGGGGA
101 CAAGGACATT CCTGCAATTA ACCAAG

SEQ ID NO: 18 (INSP005B protein sequence exon 2) 

1 VILGAPLASS CAGACGTSFP DGLTPEGTQA SGDKDIPAIN QG 


SEQ ID NO: 19 (INSP005B nucleotide sequence exon 3) 

1 GGCTCATCCT GGAAGAAACC CCAGAGAGCA GCTTCCTCAT CGAGGGGGAC 
51 ATCATCCGGC CG 


SEQ ID NO: 20 (INSP005B protein sequence exon 3) 

1 LILEETPESS FLIEGDIIRP 

SEQ ID NO: 21 (INSP005B nucleotide sequence exon 4) 

1 AGTCCCTTCC GACTGCTGTC AGCAACCAGC AACAAATGGC CCATGGGTGG
51 TAGTGGTGTC GTGGAGGTCC CCTTCCTGCT CTCCAGCAAG TACG

SEQ ID NO: 22 (INSP005B protein sequence exon 4) 

1 SPFRLLSATS NKWPMGGSGV VEVPFLLSSK YD

SEQ ID NO: 23 (INSP005B nucleotide sequence exon 5) 

1 ATGAGCCCAG CCGCCAGGTC ATCCTGGAGG CTCTTGCGGA GTTTGAACGT 
51 TCCACGTGCA TCAGGTTTGT CACCTATCAG GACCAGAGAG ACTTCATTTC
101 CATCATCCCC ATGTATGG 

SEQ ID NO: 24 (INSP005B protein sequence exon 5) 

1 EPSRQVILEA LAEFERSTCI RFVTYQDQRD FISIIPMYG

SEQ ID NO: 25 (INSP005B nucleotide sequence exon 6) 

1 GTGCTTCTCG AGTGTGGGGC GCAGTGGAGG GATGCAGGTG GTCTCCCTGG
51 CGCCCACGTG TCTCCAGAAG GGCCGGGGCA TTGTCCTTCA TGAGCTCATG
101 CATGTGCTGG GCTTCTGGCA CGAGCACACG CGGGCCGACC GGGACCGCTA
151 TATCCGTGTC AACTGGAACG AGATCCTGCC AG

SEQ ID NO: 26 (INSP005B protein sequence exon 6)

1 CFSSVGRSGG MQVVSLAPTC LQKGRGIVLH ELMHVLGFWH EHTRADRDRY
51 IRVNWNEILP G

SEQ ID NO: 27 (INSP005B nucleotide sequence exon 7) 

1 GCTTTGAAAT CAACTTCATC AAGTCTCGGA GCAGCAACAT GCTGACGCCC 
51 TATGACTACT CCTCTGTGAT GCACTATGGG AG 


SEQ ID NO: 28 (INSP005B protein sequence exon 7) 

1 FEINFIKSRS SNMLTPYDYS SVMHYGR 

SEQ ID NO: 29 (INSP005B nucleotide sequence exon 8) 

1 GCTCGCCTTC AGCCGGCGTG GGCTGCCCAC CATCACACCA CTTTGGGCCC
51 CCAGTGTCCA CATCGGCCAG CGATGGAACC TGAGTGCCTC GGACATCACC
101 CGGGTCCTCA AACTCTACGG CTGCAGCCCA AGTGGCCCCA GGCCCCGTGG
151 GAGAG

SEQ ID NO: 30 (INSP005B protein sequence exon 8) 

1 LAFSRRGLPT ITPLWAPSVH IGQRWNLSAS DITRVLKLYG CSPSGPRPRG
51 RG

SEQ ID NO: 31 (INSP005B nucleotide sequence exon 9)

1 GGTCCCATGC CCACAGCACT GGTAGGAGCC CCGCTCCGGC CTCCCTATCT
51 CTGCAGCGGC TTTTGGAGGC ACTGTCGGCG GAATCCAGGA GCCCCGACCC
101 CAGTGGTTCC AGTGCGGGAG GCCAGCCCGT TCCTGCAGGG CCTGGGGAGA
151 GCCCACATGG GTGGGAGTCC CCTGCCCTGA AAAAGCTCAG TGCAGAGGCC
201 TCGGCAAGGC AGCCTCAGAC CCTAGCTTCC TCCCCAAGAT CAAGGCCTGG
251 AGCAGGTGCC CCCGGTGTTG CTCAGGAGCA GTCCTGGCTG GCCGGAGTGT
301 CCACCAAGCC CACAGTCCCA TCTTCAGAAG CAGGAATCCA GCCAGTCCCT
351 GTCCAGGGAA GCCCAGCTCT GCCAGGGGGC TGTGTACCTA GAAATCATTT
401 CAAGGGGATG TCCGAAGAT

SEQ ID NO: 32 (INSP005B protein sequence exon 9)

1 SHAHSTGRSP APASLSLQRL LEALSAESRS PDPSGSSAGG QPVPAGPGES
51 PHGWESPALK KLSAEASARQ PQTLASSPRS RPGAGAPGVA QEQSWLAGVS
101 TKPTVPSSEA GIQPVPVQGS PALPGGCVPR NHFKGMSED






SEQ ID NO: 33 (INSP005B full nucleotide sequence) 




1 ATGGAGGGTG TAGGGGGTCT CTGGCCTTGG GTGCTGGGTC TGCTCTCCTT
51 GCCAGGTGTG ATCCTAGGAG CGCCCCTGGC CTCCAGCTGC GCAGGAGCCT
101 GTGGTACCAG CTTCCCAGAT GGCCTCACCC CTGAGGGAAC CCAGGCCTCC
151 GGGGACAAGG ACATTCCTGC AATTAACCAA GGGCTCATCC TGGAAGAAAC
201 CCCAGAGAGC AGCTTCCTCA TCGAGGGGGA CATCATCCGG CCGAGTCCCT
251 TCCGACTGCT GTCAGCAACC AGCAACAAAT GGCCCATGGG TGGTAGTGGT
301 GTCGTGGAGG TCCCCTTCCT GCTCTCCAGC AAGTACGATG AGCCCAGCCG
351 CCAGGTCATC CTGGAGGCTC TTGCGGAGTT TGAACGTTCC ACGTGCATCA
401 GGTTTGTCAC CTATCAGGAC CAGAGAGACT TCATTTCCAT CATCCCCATG
451 TATGGGTGCT TCTCGAGTGT GGGGCGCAGT GGAGGGATGC AGGTGGTCTC
501 CCTGGCGCCC ACGTGTCTCC AGAAGGGCCG GGGCATTGTC CTTCATGAGC
551 TCATGCATGT GCTGGGCTTC TGGCACGAGC ACACGCGGGC CGACCGGGAC
601 CGCTATATCC GTGTCAACTG GAACGAGATC CTGCCAGGCT TTGAAATCAA
651 CTTCATCAAG TCTCGGAGCA GCAACATGCT GACGCCCTAT GACTACTCCT
701 CTGTGATGCA CTATGGGAGG CTCGCCTTCA GCCGGCGTGG GCTGCCCACC
751 ATCACACCAC TTTGGGCCCC CAGTGTCCAC ATCGGCCAGC GATGGAACCT
801 GAGTGCCTCG GACATCACCC GGGTCCTCAA ACTCTACGGC TGCAGCCCAA
851 GTGGCCCCAG GCCCCGTGGG AGAGGGTCCC ATGCCCACAG CACTGGTAGG
901 AGCCCCGCTC CGGCCTCCCT ATCTCTGCAG CGGCTTTTGG AGGCACTGTC
951 GGCGGAATCC AGGAGCCCCG ACCCCAGTGG TTCCAGTGCG GGAGGCCAGC
1001 CCGTTCCTGC AGGGCCTGGG GAGAGCCCAC ATGGGTGGGA GTCCCCTGCC
1051 CTGAAAAAGC TCAGTGCAGA GGCCTCGGCA AGGCAGCCTC AGACCCTAGC
1101 TTCCTCCCCA AGATCAAGGC CTGGAGCAGG TGCCCCCGGT GTTGCTCAGG
1151 AGCAGTCCTG GCTGGCCGGA GTGTCCACCA AGCCCACAGT CCCATCTTCA
1201 GAAGCAGGAA TCCAGCCAGT CCCTGTCCAG GGAAGCCCAG CTCTGCCAGG
1251 GGGCTGTGTA CCTAGAAATC ATTTCAAGGG GATGTCCGAA GAT

SEQ ID NO: 34 (INSP005B full protein sequence) 

1 MEGVGGLWPW VLGLLSLPGV ILGAPLASSC AGACGTSFPD GLTPEGTQAS
51 GDKDIPAINQ GLILEETPES SFLIEGDIIR PSPFRLLSAT SNKWPMGGSG
101 VVEVPFLLSS KYDEPSRQVI LEALAEFERS TCIRFVTYQD QRDFISIIPM
151 YGCFSSVGRS GGMQVVSLAP TCLQKGRGIV LHELMHVLGF WHEHTRADRD
201 RYIRVNWNEI LPGFEINFIK SRSSNMLTPY DYSSVMHYGR LAFSRRGLPT
251 ITPLWAPSVH IGQRWNLSAS DITRVLKLYG CSPSGPRPRG RGSHAHSTGR
301 SPAPASLSLQ RLLEALSAES RSPDPSGSSA GGQPVPAGPG ESPHGWESPA
351 LKKLSAEASA RQPQTLASSP RSRPGAGAPG VAQEQSWLAG VSTKPTVPSS
401 EAGIQPVPVQ GSPALPGGCV PRNHFKGMSE D

SEQ ID NO: 35 (INSP005b mature nucleotide sequence)

1 GCGCCCCTGG CCTCCAGCTG CGCAGGAGCC TGTGGTACCA GCTTCCCAGA
51 TGGCCTCACC CCTGAGGGAA CCCAGGCCTC CGGGGACAAG GACATTCCTG
101 CAATTAACCA AGGGCTCATC CTGGAAGAAA CCCCAGAGAG CAGCTTCCTC
151 ATCGAGGGGG ACATCATCCG GCCGAGTCCC TTCCGACTGC TGTCAGCAAC
201 CAGCAACAAA TGGCCCATGG GTGGTAGTGG TGTCGTGGAG GTCCCCTTCC
251 TGCTCTCCAG CAAGTACGAT GAGCCCAGCC GCCAGGTCAT CCTGGAGGCT
301 CTTGCGGAGT TTGAACGTTC CACGTGCATC AGGTTTGTCA CCTATCAGGA
351 CCAGAGAGAC TTCATTTCCA TCATCCCCAT GTATGGGTGC TTCTCGAGTG
401 TGGGGCGCAG TGGAGGGATG CAGGTGGTCT CCCTGGCGCC CACGTGTCTC
451 CAGAAGGGCC GGGGCATTGT CCTTCATGAG CTCATGCATG TGCTGGGCTT
501 CTGGCACGAG CACACGCGGG CCGACCGGGA CCGCTATATC CGTGTCAACT
551 GGAACGAGAT CCTGCCAGGC TTTGAAATCA ACTTCATCAA GTCTCGGAGC
601 AGCAACATGC TGACGCCCTA TGACTACTCC TCTGTGATGC ACTATGGGAG
651 GCTCGCCTTC AGCCGGCGTG GGCTGCCCAC CATCACACCA CTTTGGGCCC
701 CCAGTGTCCA CATCGGCCAG CGATGGAACC TGAGTGCCTC GGACATCACC
751 CGGGTCCTCA AACTCTACGG CTGCAGCCCA AGTGGCCCCA GGCCCCGTGG
801 GAGAGGGTCC CATGCCCACA GCACTGGTAG GAGCCCCGCT CCGGCCTCCC
851 TATCTCTGCA GCGGCTTTTG GAGGCACTGT CGGCGGAATC CAGGAGCCCC
901 GACCCCAGTG GTTCCAGTGC GGGAGGCCAG CCCGTTCCTG CAGGGCCTGG
951 GGAGAGCCCA CATGGGTGGG AGTCCCCTGC CCTGAAAAAG CTCAGTGCAG
1001 AGGCCTCGGC AAGGCAGCCT CAGACCCTAG CTTCCTCCCC AAGATCAAGG
1051 CCTGGAGCAG GTGCCCCCGG TGTTGCTCAG GAGCAGTCCT GGCTGGCCGG
1101 AGTGTCCACC AAGCCCACAG TCCCATCTTC AGAAGCAGGA ATCCAGCCAG
1151 TCCCTGTCCA GGGAAGCCCA GCTCTGCCAG GGGGCTGTGT ACCTAGAAAT
1201 CATTTCAAGG GGATGTCCGA AGAT



SEQ ID NO: 36 (INSP005b mature polypeptide sequence)

1 APLASSCAGA CGTSFPDGLT PEGTQASGDK DIPAINQGLI LEETPESSFL
51 IEGDIIRPSP FRLLSATSNK WPMGGSGVVE VPFLLSSKYD EPSRQVILEA
101 LAEFERSTCI RFVTYQDQRD FISIIPMYGC FSSVGRSGGM QVVSLAPTCL
151 QKGRGIVLHE LMHVLGFWHE HTRADRDRYI RVNWNEILPG FEINFIKSRS
201 SNMLTPYDYS SVMHYGRLAF SRRGLPTITP LWAPSVHIGQ RWNLSASDIT
251 RVLKLYGCSP SGPRPRGRGS HAHSTGRSPA PASLSLQRLL EALSAESRSP
301 DPSGSSAGGQ PVPAGPGESP HGWESPALKK LSAEASARQP QTLASSPRSR
351 PGAGAPGVAQ EQSWLAGVST KPTVPSSEAG IQPVPVQGSP ALPGGCVPRN
401 HFKGMSED



SEQ ID NO: 37 (INSP005 Predicted Polypeptide Sequence)

1 MLRLWDFNPG GALSDLALGL RGMEEGGYSC AGACGTSFPD GLTPEGTQAS GDKDIPAINQ
61 GLILEETPES SFLIEGDIIR PSPFRLLSAT SNKWPMGGSG VVEVPFLLSS KYDEPSHQVI
121 LEALAEFERS TCIRFVTYQD QRDFISIIPM YGCFSSVGRS GGMQVVSLAP TCLQKGRGIV
181 LHELMHVLGF WHEHTRADRD RYIRVNWNEI LPGFEINFIK SQSSNMLTPY DYSSVMHYGR
241 LAFSRRGLPT ITPLWAPSVH IGQRWNLSAS DITRVLKLYG CSPSGPRPRG RGEWHGRKVT



SEQ ID NO: 38 (pCR4 TOPO IPAAA78836-1 plasmid nucleotide sequence) 



1 AGCGCCCAAT ACGCAAACCG CCTCTCCCCG CGCGTTGGCC GATTCATTAA TGCAGCTGGC
61 ACGACAGGTT TCCCGACTGG AAAGCGGGCA GTGAGCGCAA CGCAATTAAT GTGAGTTAGC
121 TCACTCATTA GGCACCCCAG GCTTTACACT TTATGCTTCC GGCTCGTATG TTGTGTGGAA
181 TTGTGAGCGG ATAACAATTT CACACAGGAA ACAGCTATGA CCATGATTAC GCCAAGCTCA
241 GAATTAACCC TCACTAAAGG GACTAGTCCT GCAGGTTTAA ACGAATTCGC CCTTAGCCAC
301 AGGCTTAATC TTCGGACATC CCCTTGAAAT GATTTCTAGG TACACAGCCC CCTGGCAGAG
361 CTGGGCTTCC CTGGACAGGG ACTGGCTGGA TTCCTGCTTC TGAAGATGGG ACTGTGGGCT
421 TGGTGGACAC TCCGGCCAGC CAGGACTGCT CCTGAGCAAC ACCGGGGGCA CCTGCTCCAG
481 GCCTTGATCT TGGGGAGGAA GCTAGGGTCT GAGGCTGCCT TGCCGAGGCC TCTGCACTGA
541 GCTTTTTCAG GGCAGGGGAC TCCCACCCAT GTGGGCTCTC CCCAGGCCCT GCAGGAACGG
601 GCTGGCCTCC CGCACTGGAA CCACTGGGGT CGGGGCTCCT GGATTCCGCC GACAGTGCCT
661 CCAAAAGCCG CTGCAGAGAT AGGGAGGCCG GGGCGGGGCT CCTACCAGTG CTGTGGGCAT
721 GGGACCCTCT CCCACGGGGC CTGGGGCCAC TTGGGCTGCA GCCGTAGAGT TTGAGGACCC
781 GGGTGATGTC CGAGGCACTC AGGTTCCATC GCTGGCCGAT GTGGACACTG GGGGCCCAAA
841 GTGGTGTGAT GGTGGGCAGC CCACGCCGGC TGAAGGCGAG CCTCCCATAG TGCATCACAG
901 AGGAGTAGTC ATAGGGCGTC AGCATGTTGC TGCTCTGAGA CTTGATGAAG TTGATTTCAA
961 AGCCTGGCAG GATCTCGTTC CAGTTGACAC GGATATAGCG GTCCCGGTCG GCCCGCGTGT
1021 GCTCGTGCCA GAAGCCCAGC ACATGCATGA GCTCATGAAG GACAATGCCC CGGCCCTTCT
1081 GGAGACACGT GGGCGCCAGG GAGACCACCT GCATCCCTCC ACTGCGCCCC ACACTCGAGA
1141 AGCACCCATA CATGGGGATG ATGGAAATGA AGTCTCTCTG GTCCTGATAG GTGACAAACC
1201 TGATGCACGT GGAACGTTCA AACTCCGCAA GAGCCTCCAG GATGACCTGG CGGCTGGGCT
1261 CATCGTACTT GCTGGAGAGC AGGAAGGGGA CCTCCACGAC ACCACTACCA CCCATGGGCC
1321 ATTTGTTGCT GGTTGCTGAC AGAAGGGCGA ATTCGCGGCC GCTAAATTCA ATTCGCCCTA
1381 TAGTGAGTCG TATTACAATT CACTGGCCGT CGTTTTACAA CGTCGTGACT GGGAAAACCC
1441 TGGCGTTACC CAACTTAATC GCCTTGCAGC ACATCCCCCT TTCGCCAGCT GGCGTAATAG
1501 CGAAGAGGCC CGCACCGATC GCCCTTCCCA ACAGTTGCGC AGCCTATACG TACGGCAGTT
1561 TAAGGTTTAC ACCTATAAAA GAGAGAGCCG TTATCGTCTG TTTGTGGATG TACAGAGTGA
1621 TATTATTGAC ACGCCGGGGC GACGGATGGT GATCCCCCTG GCCAGTGCAC GTCTGCTGTC
1681 AGATAAAGTC TCCCGTGAAC TTTACCCGGT GGTGCATATC GGGGATGAAA GCTGGCGCAT
1741 GATGACCACC GATATGGCCA GTGTGCCGGT CTCCGTTATC GGGGAAGAAG TGGCTGATCT
1801 CAGCCACCGC GAAAATGACA TCAAAAACGC CATTAACCTG ATGTTCTGGG GAATATAAAT
1861 GTCAGGCATG AGATTATCAA AAAGGATCTT CACCTAGATC CTTTTCACGT AGAAAGCCAG
1921 TCCGCAGAAA CGGTGCTGAC CCCGGATGAA TGTCAGCTAC TGGGCTATCT GGACAAGGGA
1981 AAACGCAAGC GCAAAGAGAA AGCAGGTAGC TTGCAGTGGG CTTACATGGC GATAGCTAGA
2041 CTGGGCGGTT TTATGGACAG CAAGCGAACC GGAATTGCCA GCTGGGGCGC CCTCTGGTAA
2101 GGTTGGGAAG CCCTGCAAAG TAAACTGGAT GGCTTTCTCG CCGCCAAGGA TCTGATGGCG
2161 CAGGGGATCA AGCTCTGATC AAGAGACAGG ATGAGGATCG TTTCGCATGA TTGAACAAGA
2221 TGGATTGCAC GCAGGTTCTC CGGCCGCTTG GGTGGAGAGG CTATTCGGCT ATGACTGGGC
2281 ACAACAGACA ATCGGCTGCT CTGATGCCGC CGTGTTCCGG CTGTCAGCGC AGGGGCGCCC
2341 GGTTCTTTTT GTCAAGACCG ACCTGTCCGG TGCCCTGAAT GAACTGCAAG ACGAGGCAGC
2401 GCGGCTATCG TGGCTGGCCA CGACGGGCGT TCCTTGGGCA GCTGTGCTCG ACGTTGTCAC
2461 TGAAGCGGGA AGGGACTGGC TGCTATTGGG CGAAGTGCCG GGGCAGGATC TCCTGTCATC
2521 TCACCTTGCT CCTGCCGAGA AAGTATCCAT CATGGCTGAT GCAATGCGGC GGCTGCATAC
2581 GCTTGATCCG GCTACCTGCC CATTCGACCA CCAAGCGAAA CATCGCATCG AGCGAGCACG
2641 TACTCGGATG GAAGCCGGTC TTGTCGATCA GGATGATCTG GACGAAGAGC ATCAGGGGCT
2701 CGCGCCAGCC GAACTGTTCG CCAGGCTCAA GGCGAGCATG CCCGACGGCG AGGATCTCGT
2761 CGTGACCCAT GGCGATGCCT GCTTGCCGAA TATCATGGTG GAAAATGGCC GCTTTTCTGG
2821 ATTCATCGAC TGTGGCCGGC TGGGTGTGGC GGACCGCTAT CAGGACATAG CGTTGGCTAC
2881 CCGTGATATT GCTGAAGAGC TTGGCGGCGA ATGGGCTGAC CGCTTCCTCG TGCTTTACGG
2941 TATCGCCGCT CCCGATTCGC AGCGCATCGC CTTCTATCGC CTTCTTGACG AGTTCTTCTG
3001 AATTATTAAC GCTTACAATT TCCTGATGCG GTATTTTCTC CTTACGCATC TGTGCGGTAT
3061 TTCACACCGC ATACAGGTGG CACTTTTCGG GGAAATGTGC GCGGAACCCC TATTTGTTTA
3121 TTTTTCTAAA TACATTCAAA TATGTATCCG CTCATGAGAC AATAACCCTG ATAAATGCTT
3181 CAATAATATT GAAAAAGGAA GAGTATGAGT ATTCAACATT TCCGTGTCGC CCTTATTCCC
3241 TTTTTTGCGG CATTTTGCCT TCCTGTTTTT GCTCACCCAG AAACGCTGGT GAAAGTAAAA
3301 GATGCTGAAG ATCAGTTGGG TGCACGAGTG GGTTACATCG AACTGGATCT CAACAGCGGT
3361 AAGATCCTTG AGAGTTTTCG CCCCGAAGAA CGTTTTCCAA TGATGAGCAC TTTTAAAGTT
3421 CTGCTATGTG GCGCGGTATT ATCCCGTATT GACGCCGGGC AAGAGCAACT CGGTCGCCGC
3481 ATACACTATT CTCAGAATGA CTTGGTTGAG TACTCACCAG TCACAGAAAA GCATCTTACG
3541 GATGGCATGA CAGTAAGAGA ATTATGCAGT GCTGCCATAA CCATGAGTGA TAACACTGCG
3601 GCCAACTTAC TTCTGACAAC GATCGGAGGA CCGAAGGAGC TAACCGCTTT TTTGCACAAC
3661 ATGGGGGATC ATGTAACTCG CCTTGATCGT TGGGAACCGG AGCTGAATGA AGCCATACCA
3721 AACGACGAGC GTGACACCAC GATGCCTGTA GCAATGGCAA CAACGTTGCG CAAACTATTA
3781 ACTGGCGAAC TACTTACTCT AGCTTCCCGG CAACAATTAA TAGACTGGAT GGAGGCGGAT
3841 AAAGTTGCAG GACCACTTCT GCGCTCGGCC CTTCCGGCTG GCTGGTTTAT TGCTGATAAA
3901 TCTGGAGCCG GTGAGCGTGG GTCTCGCGGT ATCATTGCAG CACTGGGGCC AGATGGTAAG
3961 CCCTCCCGTA TCGTAGTTAT CTACACGACG GGGAGTCAGG CAACTATGGA TGAACGAAAT
4021 AGACAGATCG CTGAGATAGG TGCCTCACTG ATTAAGCATT GGTAACTGTC AGACCAAGTT
4081 TACTCATATA TACTTTAGAT TGATTTAAAA CTTCATTTTT AATTTAAAAG GATCTAGGTG
4141 AAGATCCTTT TTGATAATCT CATGACCAAA ATCCCTTAAC GTGAGTTTTC GTTCCACTGA
4201 GCGTCAGACC CCGTAGAAAA GATCAAAGGA TCTTCTTGAG ATCCTTTTTT TCTGCGCGTA
4261 ATCTGCTGCT TGCAAACAAA AAAACCACCG CTACCAGCGG TGGTTTGTTT GCCGGATCAA
4321 GAGCTACCAA CTCTTTTTCC GAAGGTAACT GGCTTCAGCA GAGCGCAGAT ACCAAATACT
4381 GTCCTTCTAG TGTAGCCGTA GTTAGGCCAC CACTTCAAGA ACTCTGTAGC ACCGCCTACA
4441 TACCTCGCTC TGCTAATCCT GTTACCAGTG GCTGCTGCCA GTGGCGATAA GTCGTGTCTT
4501 ACCGGGTTGG ACTCAAGACG ATAGTTACCG GATAAGGCGC AGCGGTCGGG CTGAACGGGG
4561 GGTTCGTGCA CACAGCCCAG CTTGGAGCGA ACGACCTACA CCGAACTGAG ATACCTACAG
4621 CGTGAGCTAT GAGAAAGCGC CACGCTTCCC GAAGGGAGAA AGGCGGACAG GTATCCGGTA
4681 AGCGGCAGGG TCGGAACAGG AGAGCGCACG AGGGAGCTTC CAGGGGGAAA CGCCTGGTAT
4741 CTTTATAGTC CTGTCGGGTT TCGCCACCTC TGACTTGAGC GTCGATTTTT GTGATGCTCG
4801 TCAGGGGGGC GGAGCCTATG GAAAAACGCC AGCAACGCGG CCTTTTTACG GTTCCTGGGC
4861 TTTTGCTGGC CTTTTGCTCA CATGTTCTTT CCTGCGTTAT CCCCTGATTC TGTGGATAAC
4921 CGTATTACCG CCTTTGAGTG AGCTGATACC GCTCGCCGCA GCCGAACGAC CGAGCGCAGC
4981 GAGTCAGTGA GCGAGGAAGC GGAAG



SEQ ID NO: 39 (pCR4 TOPO IPAAA78836-2 plasmid nucleotide sequence)

1 AGCGCCCAAT ACGCAAACCG CCTCTCCCCG CGCGTTGGCC GATTCATTAA TGCAGCTGGC
61 ACGACAGGTT TCCCGACTGG AAAGCGGGCA GTGAGCGCAA CGCAATTAAT GTGAGTTAGC
121 TCACTCATTA GGCACCCCAG GCTTTACACT TTATGCTTCC GGCTCGTATG TTGTGTGGAA
181 TTGTGAGCGG ATAACAATTT CACACAGGAA ACAGCTATGA CCATGATTAC GCCAAGCTCA
241 GAATTAACCC TCACTAAAGG GACTAGTCCT GCAGGTTTAA ACGAATTCGC CCTTAGCCAC
301 AGGCTTAATC TTCGGACATC CCCTTGAAAT GATTTCTAGG TACACAGCCC CCTGGCAGAG
361 CTGGGCTTCC CTGGACAGGG ACTGGCTGGA TTCCTGCTTC TGAAGATGGG ACTGTGGGCT
421 TGGTGGACAC TCCGGCCAGC CAGGACTGCT CCTGAGCAAC ACCGGGGGCA CCTGCTCCAG
481 GCCTTGATCT TGGGGAGGAA GCTAGGGTCT GAGGCTGCCT TGCCGAGGCC TCTGCACTGA
541 GCTTTTTCAG GGCAGGGGAC TCCCACCCAT GTGGGCTCTC CCCAGGCCCT GCAGGAACGG
601 GCTGGCCTCC CGCACTGGAA CCACTGGGGT CGGGGCTCCT GGATTCCGCC GACAGTGCCT
661 CCAAAAGCCG CTGCAGAGAT AGGGAGGCCG GAGCGGGGCT CCTACCAGTG CTGTGGGCAT
721 GGGACCCTCT CCCACGGGGC CTGGGGCCAC TTGGGCTGCA GCCGTAGAGT TTGAGGACCC
781 GGGTGATGTC CGAGGCACTC AGGTTCCATC GCTGGCCGAT GTGGACACTG GGGGCCCAAA
841 GTGGTGTGAT GGTGGGCAGC CCACGCCGGC TGAAGGCGAG CCTCCCATAG TGCATCACAG
901 AGGAGTAGTC ATAGGGCGTC AGCATGTTGC TGCTCCGAGA CTTGATGAAG TTGATTTCAA
961 AGCCTGGCAG GATCTCGTTC CAGTTGACAC GGATATAGCG GTCCCGGTCG GCCCGCGTGT
1021 GCTCGTGCCA GAAGCCCAGC ACATGCATGA GCTCATGAAG GACAATGCCC CGGCCCTTCT
1081 GGAGACACGT GGGCGCCAGG GAGACCACCT GCATCCCTCC ACTGCGCCCC ACACTCGAGA
1141 AGCACCCATA CATGGGGATG ATGGAAATGA AGTCTCTCTG GTCCTGATAG GTGACAAACC
1201 TGATGCACGT GGAACGTTCA AACTCCGCAA GAGCCTCCAG GATGACCTGG CGGCTGGGCT
1261 CATCGTACTT GCTGGAGAGC AGGAAGGGGA CCTCCACGAC ACCACTACCA CCCATGGGCC
1321 ATTTGTTGCT GGTTGCTGAC AGCAGTCGGA AGGGACTCGG CCGGATGATG TCCCCCTCGA
1381 TGAGGAAGCT GCTCTCTGGG GTTTCTTCCA GGATGAGCCC TTGGTTAATT GCAGGAATGT
1441 CCTTGTCCCC GGAGGCCTGG GTTCCCTCAG GGGTGAGGCC ATCTGGGAAG CTGGTACCAC
1501 AGGCTCCTGC GCAGCTGGAG GCCAGGGGCG CTCCTAGGAT CACACCTGGC AAGGAGAGCA
1561 GACCCAGCAC CCAAGGCCAG AGACCCCCTA CACCCTCCAT GGTAGAAAGG GCGAATTCGC
1621 GGCCGCTAAA TTCAATTCGC CCTATAGTGA GTCGTATTAC AATTCACTGG CCGTCGTTTT
1681 ACAACGTCGT GACTGGGAAA ACCCTGGCGT TACCCAACTT AATCGCCTTG CAGCACATCC
1741 CCCTTTCGCC AGCTGGCGTA ATAGCGAAGA GGCCCGCACC GATCGCCCTT CCCAACAGTT
1801 GCGCAGCCTA TACGTACGGC AGTTTAAGGT TTACACCTAT AAAAGAGAGA GCCGTTATCG
1861 TCTGTTTGTG GATGTACAGA GTGATATTAT TGACACGCCG GGGCGACGGA TGGTGATCCC
1921 CCTGGCCAGT GCACGTCTGC TGTCAGATAA AGTCTCCCGT GAACTTTACC CGGTGGTGCA
1981 TATCGGGGAT GAAAGCTGGC GCATGATGAC CACCGATATG GCCAGTGTGC CGGTCTCCGT
2041 TATCGGGGAA GAAGTGGCTG ATCTCAGCCA CCGCGAAAAT GACATCAAAA ACGCCATTAA
2101 CCTGATGTTC TGGGGAATAT AAATGTCAGG CATGAGATTA TCAAAAAGGA TCTTCACCTA
2161 GATCCTTTTC ACGTAGAAAG CCAGTCCGCA GAAACGGTGC TGACCCCGGA TGAATGTCAG
2221 CTACTGGGCT ATCTGGACAA GGGAAAACGC AAGCGCAAAG AGAAAGCAGG TAGCTTGCAG
2281 TGGGCTTACA TGGCGATAGC TAGACTGGGC GGTTTTATGG ACAGCAAGCG AACCGGAATT
2341 GCCAGCTGGG GCGCCCTCTG GTAAGGTTGG GAAGCCCTGC AAAGTAAACT GGATGGCTTT
2401 CTCGCCGCCA AGGATCTGAT GGCGCAGGGG ATCAAGCTCT GATCAAGAGA CAGGATGAGG
2461 ATCGTTTCGC ATGATTGAAC AAGATGGATT GCACGCAGGT TCTCCGGCCG CTTGGGTGGA
2521 GAGGCTATTC GGCTATGACT GGGCACAACA GACAATCGGC TGCTCTGATG CCGCCGTGTT
2581 CCGGCTGTCA GCGCAGGGGC GCCCGGTTCT TTTTGTCAAG ACCGACCTGT CCGGTGCCCT
2641 GAATGAACTG CAAGACGAGG CAGCGCGGCT ATCGTGGCTG GCCACGACGG GCGTTCCTTG
2701 CGCAGCTGTG CTCGACGTTG TCACTGAAGC GGGAAGGGAC TGGCTGCTAT TGGGCGAAGT
2761 GCCGGGGCAG GATCTCCTGT CATCTCACCT TGCTCCTGCC GAGAAAGTAT CCATCATGGC
2821 TGATGCAATG CGGCGGCTGC ATACGCTTGA TCCGGCTACC TGCCCATTCG ACCACCAAGC
2881 GAAACATCGC ATCGAGCGAG CACGTACTCG GATGGAAGCC GGTCTTGTCG ATCAGGATGA
2941 TCTGGACGAA GAGCATCAGG GGCTCGCGCC AGCCGAACTG TTCGCCAGGC TCAAGGCGAG
3001 CATGCCCGAC GGCGAGGATC TCGTCGTGAC CCATGGCGAT GCCTGCTTGC CGAATATCAT
3061 GGTGGAAAAT GGCCGCTTTT CTGGATTCAT CGACTGTGGC CGGCTGGGTG TGGCGGACCG
3121 CTATCAGGAC ATAGCGTTGG CTACCCGTGA TATTGCTGAA GAGCTTGGCG GCGAATGGGC
3181 TGACCGCTTC CTCGTGCTTT ACGGTATCGC CGCTCCCGAT TCGCAGCGCA TCGCCTTCTA
3241 TCGCCTTCTT GACGAGTTCT TCTGAATTAT TAACGCTTAC AATTTCCTGA TGCGGTATTT
3301 TCTCCTTACG CATCTGTGCG GTATTTCACA CCGCATACAG GTGGCACTTT TCGGGGAAAT
3361 GTGCGCGGAA CCCCTATTTG TTTATTTTTC TAAATACATT CAAATATGTA TCCGCTCATG
3421 AGACAATAAC CCTGATAAAT GCTTCAATAA TATTGAAAAA GGAAGAGTAT GAGTATTCAA
3481 CATTTCCGTG TCGCCCTTAT TCCCTTTTTT GCGGCATTTT GCCTTCCTGT TTTTGCTCAC
3541 CCAGAAACGC TGGTGAAAGT AAAAGATGCT GAAGATCAGT TGGGTGCACG AGTGGGTTAC
3601 ATCGAACTGG ATCTCAACAG CGGTAAGATC CTTGAGAGTT TTCGCCCCGA AGAACGTTTT
3661 CCAATGATGA GCACTTTTAA AGTTCTGCTA TGTGGCGCGG TATTATCCCG TATTGACGCC
3721 GGGCAAGAGC AACTCGGTCG CCGCATACAC TATTCTCAGA ATGACTTGGT TGAGTACTCA
3781 CCAGTCACAG AAAAGCATCT TACGGATGGC ATGACAGTAA GAGAATTATG CAGTGCTGCC
3841 ATAACCATGA GTGATAACAC TGCGGCCAAC TTACTTCTGA CAACGATCGG AGGACCGAAG
3901 GAGCTAACCG CTTTTTTGCA CAACATGGGG GATCATGTAA CTCGCCTTGA TCGTTGGGAA
3961 CCGGAGCTGA ATGAAGCCAT ACCAAACGAC GAGCGTGACA CCACGATGCC TGTAGCAATG
4021 GCAACAACGT TGCGCAAACT ATTAACTGGC GAACTACTTA CTCTAGCTTC CCGGCAACAA
4081 TTAATAGACT GGATGGAGGC GGATAAAGTT GCAGGACCAC TTCTGCGCTC GGCCCTTCCG
4141 GCTGGCTGGT TTATTGCTGA TAAATCTGGA GCCGGTGAGC GTGGGTCTCG CGGTATCATT
4201 GCAGCACTGG GGCCAGATGG TAAGCCCTCC CGTATCGTAG TTATCTACAC GACGGGGAGT
4261 CAGGCAACTA TGGATGAACG AAATAGACAG ATCGCTGAGA TAGGTGCCTC ACTGATTAAG
4321 CATTGGTAAC TGTCAGACCA AGTTTACTCA TATATACTTT AGATTGATTT AAAACTTCAT
4381 TTTTAATTTA AAAGGATCTA GGTGAAGATC CTTTTTGATA ATCTCATGAC CAAAATCCCT
4441 TAACGTGAGT TTTCGTTCCA CTGAGCGTCA GACCCCGTAG AAAAGATCAA AGGATCTTCT
4501 TGAGATCCTT TTTTTCTGCG CGTAATCTGC TGCTTGCAAA CAAAAAAACC ACCGCTACCA
4561 GCGGTGGTTT GTTTGCCGGA TCAAGAGCTA CCAACTCTTT TTCCGAAGGT AACTGGCTTC
4621 AGCAGAGCGC AGATACCAAA TACTGTCCTT CTAGTGTAGC CGTAGTTAGG CCACCACTTC
4681 AAGAACTCTG TAGCACCGCC TACATACCTC GCTCTGCTAA TCCTGTTACC AGTGGCTGCT
4741 GCCAGTGGCG ATAAGTCGTG TCTTACCGGG TTGGACTCAA GACGATAGTT ACCGGATAAG
4801 GCGCAGCGGT CGGGCTGAAC GGGGGGTTCG TGCACACAGC CCAGCTTGGA GCGAACGACC
4861 TACACCGAAC TGAGATACCT ACAGCGTGAG CTATGAGAAA GCGCCACGCT TCCCGAAGGG
4921 AGAAAGGCGG ACAGGTATCC GGTAAGCGGC AGGGTCGGAA CAGGAGAGCG CACGAGGGAG
4981 CTTCCAGGGG GAAACGCCTG GTATCTTTAT AGTCCTGTCG GGTTTCGCCA CCTCTGACTT
5041 GAGCGTCGAT TTTTGTGATG CTCGTCAGGG GGGCGGAGCC TATGGAAAAA CGCCAGCAAC
5101 GCGGCCTTTT TACGGTTCCT GGGCTTTTGC TGGCCTTTTG CTCACATGTT CTTTCCTGCG
5161 TTATCCCCTG ATTCTGTGGA TAACCGTATT ACCGCCTTTG AGTGAGCTGA TACCGCTCGC
5221 CGCAGCCGAA CGACCGAGCG CAGCGAGTCA GTGAGCGAGG AAGCGGAAG



